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1.0  Introduction 

The  following  report  is  an  examination  of  concepts  for  the  utilization  of  three- 
dimensional  depth  map  information  for  the  construction  of  features  to  be  employed  by 
neural  network  based  classifiers  in  the  classification  of  underwater  mines.  The 
evaluation  of  the  discriminatory  power  of  a  subset  of  the  identified  three  dimensional 
features  for  mine  detection/classification,  in  the  context  of  actual  or  simulated  data, 
will  be  reported  separately.  The  context  that  forms  the  basis  for  the  current  study 
envisions  the  availability  of  coregistered  side-scan  sonar-derived  intensity  imagery 
and  three-dimensional  depth  map  information  at  the  same  resolution.  Hence  the 
tnree-dimensional  related  features  to  be  described  are  to  be  viewed  as  augmenting 
features  computed  from  side-scan  sonar  imagery. 

The  discussion  that  follows  focuses  separately  in  Sections  2  and  3  on  the 
construction  of  three-dimensional  features  that  use  low-resolution  and  high-resolution 
depth  map  construction  concepts,  respectively,  and  associated  with  the  use  of 
telesounding  and  swath-bathymetry  techniques,  respectively.  Finally,  Section  4 
summarizes  the  study's  conclusions. 

2.0  Low-Resolution  Three-Dimensional  Depth  Map  Utilization  Concepts 

The  discussion  that  follows  will  first  review  the  telesounder  concept  for  the 
construction  of  depth  maps.  Next,  the  combined  use  of  side-scan  sonar  data  and 
height  map  data  in  the  estimation  of  underwater  mine-size-related  features  will  be 
described. 

The  geometry  underlying  the  telesounder  depth  map  construction  concept  is 
pictured  in  Figure  2-1.  Two  acoustic  receivers  mounted  on  a  tow  body  are  positioned 
vertically  with  a  separation  A,  and  with  the  lower  receiver  also  functioning  as  a 
transmitter.  It  is  assumed  that  the  height  of  receiver  #2  above  some  "reference" 
bottom  level,  hR,  is  known,  and  that  the  sea-bottom  terrain  height  above  the 
"reference"  bottom  at  a  ground  range  x,  is  denoted  by  hb(x).  In  addition,  let  rsi(x), 
rS2(x),  denote  the  slant  range  from  receivers  #1  and  #2,  to  the  point  (x,hb(x)), 
respectively. 

The  telesounder  concept  (Ref.  [1])  consists  in  forming  the  signal  z(t)  as 

z(t)  =  h  (t)  +  r2  (tj  (2-1) 
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Figure  2-1.  Telesounder  depth  map  construction  geometry. 

where  ri(*),  r2(#)  denote  demodulated,  complex  signals  associated  with  receivers  #1 
and  #2,  respectively.  For  the  sake  of  simplicity,  it  is  assumed  here  that  reflections  are 
only  received  from  the  specific  bottom  patch  at  (x,  hb(x)).  Then,  assuming  a 
transmitted  complex  signal  of  the  form 

s(t)  =  UT(t}elWot  0-21 


UT(t)  A 


0  <  t  <  T 
otherwise 


(2-3) 


and  assuming  an  idealized  propagation  process  with  no  effect  other  than  delay,  it  can 
be  shown  that  over  the  time  interval  where  the  two  received  pulses  interact,  that  the 
signal  z(t)  approximately  includes  a  factor  of  the  form  f(0  (x))  where 


1  too  A 

2  v 


(2-4) 


and  v  denotes  the  velocity  of  propagation  of  sound.  Now,  letting 
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A  =  k  X 


(2-5) 


where  X  denotes  the  acoustic  wavelength  and  noting  that 

coo 


(2-6) 


it  can  be  seen  that 


f(0)  =  |cos  (tc  k  sin  0)  |  (2-7) 

The  conclusion  to  be  obtained  from  the  above  discussion  is  that  specific  0's 
corresponding  to  the  maxima  of  (2-7)  will  be  associated  with  specific  maxima  in  the 
"interference"  signal  z(t).  Hence,  at  those  specific  0  values,  the  depth  may  be 
deduced  by  the  use  of  the  approximate  equation 

hb(x)  =  hR  -  A  -  rsl  (x)  sin  0(x)  (2-8) 

where  rsi(x)  is  deduced  from  the  time  location  of  the  appropriate  z(»)  maxima. 

From  (2-7)  it  can  be  shown  that  maxima  of  f(0)  occur  at  discrete  0n's  defined 
by 


9n  =  Sm'(a)„  =  0...k  (29) 

For  illustrative  purposes,  assuring  that  the  bottom  in  Figure  2-1  were  flat,  then  the 
ground  ranges  associated  with  the  f(0)  maxima  would  be  specified  by 

xn  =  hR  cot  (0n)  (2-10) 

Table  2-1  lists  0n,  xn,  (xn.i  -  xn)  values  for  a  specific  case  in  which 

jhR  =  40  m 

)k=10  (2-11)7(2-12) 
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Table  2-1.  0  values  and  ground  ranges  associated  with  interference  signal  maxima. 


n 

9n 

xn 

xn-l  "  xn 

0 

0 

oo 

1 

0.10017  rad 

397.98  m 

2 

0.20136  rad 

195.96  m 

202.02  m 

3 

0.3469  rad 

127.19  m 

68.77  m 

4 

0.41152  rad 

91.65  m 

35.54  m 

5 

0.52360  rad 

69.28  m 

22.37  m 

6 

0.64350  rad 

53.33  m 

15.95  m 

7 

0.77540  rad 

40.81  m 

12.52  m 

8 

0.92730  rad 

29.99  m 

10.82  m 

9 

1.1198  rad 

19.37  m 

10.62  m 

10 

tc/2  rad 

0.0  m 

19.37  m 

Hence,  for  the  specific  case  considered  in  Table  2-1,  over  the  first  200  m  of  ground 
range,  the  separation  between  xn  values  varies  from  10.62  m  up  to  68.77  m. 

In  (Ref.  [2]),  a  variety  of  cylindrical-shaped  U.S.  mines  are  discussed,  and  their 
dimensions  are  listed.  The  ranges  of  diameter  and  length  dimensions  are  listed  in 
Table  2-2.  The  above  discussion  of  the  telesounder- based  depth  estimation  technique 
is  sufficient  to  suggest  that  detailed  three-dimensional  surface  and  shape  information 
characterizing  individual  mines  is  not  likely  to  be  obtainable  from  gridded,  telesounder- 
derived  depth  maps.  Hence,  the  discussion  that  follows  suggests  a  way  of  employing 
low-resolution,  telesounder-derived  depth  map  information  to  estimate  mine-size- 
related  features. 


Table  2-2.  Cylindrical  mine  dimensions. 


Minimum 

Maximum 

Diameter 

0.38  m 

0.73  m 

Length 

1.8  m 

4.1  m 

When  human  sonar  operators  interpret  side-scan  sonar  imagery,  they  look  for 
highlight  regions  with  associated  shadows.  The  concept  suggested  here  is  to  make 
use  of  side-scan  sonar-derived  slant-range  and  shadow  extent  information,  combined 
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with  approximate  bottom  map  information,  hb  (x),  in  order  to  estimate  object  diameter. 
In  Figures  2-2  and  2-3,  the  geometry  of  the  object  diameter  calculation  problem  for  the 
case  of  an  object  on  the  bottom,  or  in  the  water  column,  respectively,  is  depicted.  The 
notations  rs,  /s  will  denote  the  slant  range  to  the  object  "highlight"  center  and  the  slant- 
range  extent  of  the  shadow,  respectively.  In  the  case  of  Figure  2-3,  gs  denotes  the 
slant-range  extent  of  the  gap  between  the  center  of  the  "highlight"  and  the  beginning 
of  the  shadow.  The  notations  Xh,  and  {xSj  xsu,  Xj/)  denote  ground  ranges  associated 
with  the  "highlight"  center  and  shadow  edges,  respectively.  Finally,  hQ  and  hwc  denote 
the  diameter  of  the  object  and  the  water  column  height,  respectively. 

SONAR 


Figure  2-2.  Geometry  of  object  diameter  calculation  for  case  of  object  on  the  bottom. 


Figure  2-3.  Geometry  of  object  diameter  calculation  for  case  of  object  in  the 
water  column. 
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The  relationships  that  allow  the  calculation  of  ho,  hwc  are  derived  from  equating 
the  ratios  of  corresponding  sides  of  similar  triangles.  In  the  case  of  an  object  on  the 
bottom. 


hR  _  (ho  +  hb(xh)) 

(rs  +  4  +  A')  (4  +  A') 

(2-13) 

where 

_  hb(xs) 

sin  (9) 

(2-14) 

and 

0  =  cos4 

(2-15) 

and  therefore  implying  that 

110  =  r~h — hR  * hb  (Xh) 

(rs+4  +  A') 

(2-16) 

For  an  object  in  the  water  column  depicted  in  Figure  2-3,  two  sets  of  similar  triangles 
yield  the  relations 

hR  _  (hwc  +  hb  (xjj)) 

(rs  +  gs  +  A'g)  (gs  +  A'g) 

(2-17) 

hR  _  (ho  +  hwc  ~r  hb  (xh)) 

(rs  +  gs  +  4  +  A'u)  (gs  +  4  +  A’u) 

(2-18) 

where 

hb(xsu) 

U  sin  (0U) 

(2-19) 

A'p  =  hb^ 
g  («.) 

(2-20) 
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and 


e  _  cos-t  /xsu  -  xh\ 

\gs+'s  1 

(2-21) 

e /  -  cos-i  (Xfi ' Xh) 

l  g si 

(2-22) 

Relations  (2-17)  -  (2-12)  therefore  imply  that 


h 


WC 


(gs  +  A  V) 

(rs  +  gs  +  A’/) 


hR  -  hb(xh) 


(2-23) 


ho  = 


(gs  4  Au) 
(ts  gs  A  "*■  A  u) 


hR  -  hwc  -  hb(xh) 


(2-24) 


The  above  discussion  only  addresses  the  geometric  relationships  that  could 

./-v 

allow  the  construction  of  hQ  estimates,  ho.  We  next  consider  how  some  of  the  specific 
quantities  required  in  the  above  relations  could  be  obtained  in  practice.  By  computing 
an  empirical  histogram  of  side-scan  sonar  intensity  image  values  over  a  given  strip,  it 
is  possible  to  determine  thresholds  x£b  so  that  e  percent  of  intensity  values  lie 
below  and  above  these  respective  values;  then  by  employing  x£h.  it  is  possible  to 
form  candidate  shadow  and  highlight  pixel  sets.  In  each  case,  these  sets  will  consist 
of  a  finite  number  of  connected  components  whose  number  can  be  reduced  by  a  size 
constraint,  i.e.,  requiring  each  connected  component  to  have  over  m  pixels.  Next,  by 
introducing  fuxxher  geometric  constraints  on  the  relative  location  of  highlight  blobs  and 
shadow  blobs,  a  candidate  set  of  pairs  (highlight  blob,  shadow  blob)  can  be 
determined.  Finally,  by  making  use  of  our  knowledge  of  the  sets  (highlight  blob, 
shadow  blob)  and  counting  pixels  in  the  cross-range  direction  corresponding  to  the 
center  of  each  highlight  blob,  and  knowing  the  slant-range  resolution  per  pixel,  it 
should  be  possible  to  determine  rs,  gs,  ^values. 

The  remaining  quantities  required  to  support  h0  calculation  consist  in  the 

ground  ranges  (xj,,  xs)  or  (xj,,  xs/l  xsu),  and  corresponding  bottom  height  values,  hb(*)- 

.  \ 

It  is  assumed  here  that  approximate  bottom  height  information,  hb  (•),  is  available  at 
the  same  resolution  of  the  side-scan  sonar  imagery,  through  the  application  of  some 
gridding  algorithm  to  discrete,  scattered  telesounder-derived  height  measurements.  In 
addition,  note  that  ground-range  information  is  assumed  to  be  obtainable  through  the 
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use  of  geometric  corrections  that  make  use  of  hb(*)-  For  example,  consider  a  reflector 
at  a  height  ha  above  the  bottom  and  at  a  slant  range  rsr;  then,  the  associated  estimated 

ground  range,  xr ,  can  be  viewed  as  the  solution  of  the  equation 

[hR  -  ha  -  hb(x)]2  +  x2  =  4  (2-25) 

In  the  application  of  (2-25),  hD  =  0  for  bottom  points  associated  with  shadow  endpoints 
identified  in  Figures  2-2  and  2-3.  In  the  case  of  an  object  in  the  water  column,  as 
depicted  in  Figure  2-3,  ha  could  be  set  to  zero  initially,  and  the  calculations  repeated  in 
an  iterated  manner,  using  better  estimates  for  ha  as  they  are  obtained. 

The  above  discussion  has  identified  a  concept  for  using  telesounder-derived, 
low-resolution,  three-dimensional  depth  map  information  to  estimate  object  diameter, 
hQ.  The  quantities  (rs,  gs,  /s)  required  in  the  above  analysis  might  be  expected  to  be 
accurate  to  within  a  few  slant-range  resolution  cells.  The  accuracy  of  ground  range 

and  hb(*)  values  are  coupled  through  (2-25).  In  addition,  the  accuracy  of  hb  (•)  values 
will  directly  affect  the  accuracy  of  h0  estimates  through  (2-14)  and  (2-16)  or  (2-19), 

(2-20),  (2-23),  respectively.  In  tum,  the  accuracy  of  hb  (*)  values  will  depend  on 

•  The  accuracy  of  discrete  telesounder-derived  height  measurements. 

•  The  interpolation  error  implicit  in  the  adopted  depth  map  gridding  algorithm. 

In  the  final  analysis,  the  utility  of  the  above  approach  may  be  best  evaluated  through 
attempting  to  implement  it  using  actual  data. 

3.0  High-Resolution  Three-Dimensional  Depth  Map  Utilization  Concepts 

The  discussion  that  follows  will  first  review  the  swath-bathymetry  concept  for 
constructing  high-resolution  depth  maps,  as  outlined  t>y  Denbigh  in  Ref.  rl].  Based  on 
Denbigh's  analysis,  figures  for  the  required  signal-to-interference  ratios  in  received 
signals  will  be  obtained  in  order  to  achieve  desired  depth  accuracies.  Next,  in 
Sections  3.1  and  3.2,  feature  calculations  based  on  stochastic  and  deterministic  depth 
surface  modeling  approaches,  respectively,  will  be  considered. 

The  telesounder-based  depth  measurement  concept  was  based  on  the  addition 
of  received  demodulated  signals  rl(»),  r2(*)  obtained  from  two  receiver*  as  depicted  in 

Figure  2-1.  The  factor  in  (2-4)  that  weights  the  magnitude  of  the  sum  signal  z(t),  is  a 
result  of  the  phase  difference  between  the  two  received  signals.  In  the  swath- 
bathymetry  depth  measurement  approach,  the  phase  difference  between  the  two 
received  signals  is  estimated  directly  from  the  individual  signals  and  employed  to 
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allow  depth  measurements  at  each  slant-range  resolution  cell.  For  the  sake  of 
consistency  with  Denbigh's  discussion  and  analysis,  the  geometry  of  the  two 
receivers  and  the  associated  depth  calculation  is  depicted  in  Figure  3-1. 

Hence,  from  Figure  3-1,  the  depth  below  receiver  #1,  hp(x),  may  be  calculated 
as 


ho(x)  =  rsi(x)  sin  (p  +  a(x))  (3-i) 

where  a(x)  is  related  to  the  phase  difference  between  the  two  received  signals,  0(x), 
by 


4>(x)  =  sjn  a(x) 

A, 


(3-21 


In  summary,  <p(x)  and  rsi(x)  are  estimated  directly  from  the  received  demodulated 
signals  r^*),  r2(«);  <J)(x)  is  then  used  to  calculate  a(x)  from  (3-2);  and  then  a(x), 
rsi(x)  are  used  to  form  hD(x)  from  (3-1). 

In  Ref.  [1],  Denbigh  investigates  the  question  of  determining  the  minimal 
signal-to-interference  ratio  necessary  to  guarantee  some  desired  fractional  depth 
accuracy.  If  Ahp,  A0  denote  depth  and  phase  errors,  respectively,  then  based  on  a 
linearized  analysis,  Denbigh  shows  that 


Ahp 

hp(x) 


A0 


_ 1 _ 

COS  (V(x)  -  p)  tan  (\j/(x)) 


(3-3) 


Based  on  a  phase  estimation  error  analysis  Denbigh  obtains  the  following  empirically 
fitted  result  for  the  standard  derivation  OAo  associated  with  the  phase  estimation 
error 


aA4>  - 


_2JL 

P-43 


(3-4) 


where  p  denotes  the  signal-to-interference  power  ratio  in  the  received  signal.  Hence, 
if 


|A$  =  Oao 


(3-5) 
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Figure  3-1.  Denbigh's  swath  bathymetry  depth  map  construction  geometry. 


Figure  3-1. 


0 


then  in  order  that 


Ahp 

hp(x) 


(3-6) 


we  must  require  that 


P  = 


_ 2J _ 

cos  (\y(x)  -  p)  tan  \y  (x) 


l/«43 


(3-7) 


We  next  apply  (3-6)  and  (3-7)  to  a  specific  case.  From  Table  2-2,  the  minimal 
diameter  of  a  cylindrical-shaped  mine  was  =  0.4  m,  and  hence,  we  let 


Ahp  =  (0.l)x(0.4m)  =  0.04  m 


then  if  hp  =  hR,  as  defined  by  (2-11),  then 


e  =  10'3 


Letting 


IP  =  0 


VM  =  S- 
4 

A  =  10  X 


and  employing  (3-7)  and  (3-9)  results  in  a  value  for  p  of 


(P)»  =  40.7 


(3-8) 

(3-9) 

(3-10) 

(3-11) 

(3-12) 

(3-13) 


The  result  (3-13)  illustrates  the  stringent  signal-to-interference  ratios  required  in 
order  to  obtain  depth  measurements  sufficiently  accurate  to  yield  shape  information 
concerning  cylindrical  objects  with  diameters  as  small  as  =  0.4  m.  Reference  [1] 
discusses  techniques  for  reducing  signal-to-interference  ratios.  One  approach  is  to 
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average  phase  measurements  over  some  collection  of  slant- range  and  along- track 
side-scan  sonar  image  cells  in  order  to  reduce  values.  By  averaging  N  phase 
values,  a  reduction  of  1/Vn~  may  be  achieved  in  a^,  and  hence,  for  example,  by  letting 

N  =  100  (3-14) 


the  figure  in  (3-13)  becomes 


(p)cfi  =  17.4  dB  (3-15) 

Unfor?  lately,  averaging  phase  information  results  in  a  loss  of  slant-range  and  along- 
track  resolution  in  the  resulting  depth  map.  In  the  above  case,  to  ensure  a  minimum  of 
5  resolution  cells  associated  with  a  0.4-m  diameter  cylindrical  cross  section  requires  a 
slant-range  resolution  of  =8  cm  or  3.1  in. 

In  summary,  the  above  discussion  suggests  that  slant-range  resolutions  of  8 
cm  and  signal- to-interference  ratios  of  as  high  as  40.7  dB  may  be  necessary  to 
adequately  characterize  cylindrically-shaped  mines  as  small  as  0.4  m  in  diameter. 

3.1  Feature  Calculation  Based  on  Stochastic  Depth  Surface  Modeling 

The  discussion  that  follows  adopts  the  assumption  that  the  two-dimensional 
terrain  bottom  height,  hb(x,y),  pictured  as  the  two-dimensional  extension  of  the  terrain 
in  Figure  3-1  and  its  discrete  grid  counterpart,  hb(k,l),  is  modeled  as  a  random  field. 
In  this  context,  a  bottom  mine  will  appear  as  an  "anomalous"  patch  of  terrain  whose 
statistical  characteristics  are  inconsistent  or  unpredictable  relative  to  the  surrounding 
"background"  terrain.  Hence,  the  feature  calculation  approaches  described  here  are 
motivated  by  the  literature  on  texture  measures  (Refs.  [3],  [4],  [5])  as  well  as 
random  field-based  approaches  to  image  processing  (Ref.  [6]). 

The  statistics  of  a  stationary  random  field  are  characterized  in  the  spatial 
domain  by  its  mean  value  and  correlation  function,  or  in  the  spectral  domain  by  its 
power  spectral  density  function.  Hence,  typical  choices  for  features  are  related  to 
empirical  correlation  function  or  power  spectral  densities.  Empirical  autocorrelation 
function  values,  c(Ax,Ay),  computed  based  on  field  values  hb(k,  1)  associated  with  a 
specified  two-dimensional,  processing  "estimation"  window.  Re,  are  defined  as 
follows: 


12 


Let  empirical  mean  and  standard  deviation  estimates,  in,  o2  be  defined  by 


m  =  j^j  X  hb(M) 

|K«(k,i)6RB  (3-16) 


s2  =  j-J-j-  X  (hb  (k,  l)  -  m)2 

|KEl(k.i)eRB  (3-17) 


where  |ReI  denotes  the  number  of  elements  in  Re,  then 


c(Ax,  Ay)=  — - L-  £  (hb  (k  +  Ax,  1  +  Ay)  -  m)  (hb  (k,  1)  -  m) 

N(AX,  Ay)  CT  ((1c,  1)  €  R  E  I 

|&(k+  1  +  Ay)€RE| 

(3-18) 


where  N(AX,  Ay)  denotes  the  number  of  terms  in  the  summation  defined  by  (3-18).  A 
vector  of  two-dimensional  correlation  coefficient  features  can  be  obtained  by  varying 
(Ax,  Ay)  e  R^n),  where  R^n)  denotes  a  region  in  the  2D  correlation  plane  of  the  form 
depicted  in  Figure  3-2.  Due  to  the  symmetry  of  the  autocorrelation  function  through 
the  origin,  knowledge  of  c(Ax,  Ay)  for  (Ax,  Ay)  e  R^n)  implies  knowledge  of 
autocorrelation  function  estimates  over  the  region  [-n,  n]  x  [-n,  n]. 

Next,  the  construction  of  power  spectral  density  related  features  is  described. 
Assuming  for  simplicity  that  Re  =  [0,  Ne*1]  x  [0,  Ne  -1],  the  discrete  Fourier 
transform  Hb(u,v)  associated  with  the  zero-mean  field,  hb(»,  •)  -m  is  defined  by 

NE-i 

Hb(u,  v)  =  ]T  [hb(k,  1)- m]  e-‘2,l(ku  +  lv)  u,v,  e  [0,  NE -l] 

Ne  k,  i  =  o  (3-19) 


Standard  normalized  texture  features  based  on  ring  and  wedge-shaped  samples  of  the 
discrete  Fourier  power  spectrum  are  defined  as  follows 

<k,.r2=  XlHb(u>  vf 

|  r2  <  u2  +  v2  <  r^j 

1  u,  v  e  [0,  Ne  -l]|  (3-20) 
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(n,0) 


(*n,0) 


Figure  3-2.  region  in  2D  correlation  plane. 


(3-21) 


V0i.02=  ^  JtHb(u,vf 
cr 

1 01  <  tan'1  (v/u)<  02 1 
\  u,  v  e  [0,  Ne  -l]  } 

By  choosing  a  discrete  collection  of  (rj,  r2),  (0i,  62)  pairs,  a  vector  of  power  spectral 

density  related  features  may  be  defined. 

The  features  defined  by  (3-18),  (3-20)  and  (3-21),  together  with  a2,  use 
nonparametric  techniques  to  characterize  the  statistics  in  the  spatial  and  spectral 
domains,  respectively,  of  the  zero  mean  random  field  determined  by  [hb  (•,  •)  -  m].  An 
alternative  approach  for  constructing  statistical  features  would  be  to  adopt  the 
assumption  that  hb(*,  •)  can  be  modeled  by  a  parametric,  2D  autoregressive  (ar) 

model  of  the  form 


hb(k,  1)  -  J  On,m  hb  (k  +  n,  1  +  m)  =  wk,  1  +  pw 

(ivnJeS^11*) 


(3-22) 


where  denotes  a  2D  region  pictured  in  Figure  3-3,  wk,i  denotes  a  zero-mean 
spatially  independent  random  field  with  variance 

E(w£i)  =  °w  (3-23) 

and  pw  denotes  a  nonzero  mean  component  to  the  prediction  error  process  defined  by 
[pw  +  wk  [].  In  the  above  context,  estimated  2D  model  parameters 

(a1, 6cnim(n,m)6  S(n*>)  could  be  identified  as  features  that  determine  the  spatial  or 
spectral  properties  of  the  zero-mean  random  field  [htiK  •)  -  m].  Estimates  of  {«n.m»  Pw) 
are  determined  through  the  solution  of  a  linear  least-squares  problem  defined  by 
optimizing  Ols.  where 

OLS=lfhb(k,D-  J  an,mhb(k  +  n,l-t-m)-pwY 
\  (umleSW  / 

(k,  1)  e  Re  I 

so  that  (k  +  n,  1  +  m)  e  Re  / 

V(n,m)eS^  )  (3-24) 
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Figure  3-3.  Support,  S^na\  of  2D  autoregressive  model. 


If  itils  denotes  the  number  of  terms  in  the  summation  defined  by,  then 


o2w  =  — ^ —  Ols 

nxs 


{On,m,fw) 


(3-25) 


Refs.  [7]  and  [8]  discuss  solution  techniques  for  linear  least-squares 
problems.  The  performance  functional  G>ls  can  be  expressed 

Ols  =  1 1  Ax  -  b  |  p  (3-26) 

for  an  appropriately  defined  oils  x  n^s  -  dimensional  matrix.  A,  itils  -  dimensional 
vector  b,  and  nLs  -  dimensional  vector  x  defined  as 

A  (aiynJn.mJeS^H 

\  Hw  7  (3-27) 


when  nLs  is  expressed  as 


nLs  =  2n|  +  2na  +  1  (3-28) 

The  most  computationally  inexpensive  technique  for  optimizing  <I>ls  involves  the 
solution  of  the  normal  equations 


[A'  A]  x  =  A'  b 


(3-29) 


and  requires  a  total  of 


(n£s/2)  [mLs  +  (iils/3)]  (3-30) 

floating-point  operations  using  the  Cholesky  decomposition  to  solve  (3-29),  by 
factorizing  A  A  into  a  product  of  upper  and  lower  triangular  factors,  and  solving  a 
sequence  of  triangular  systems. 

While  the  estimated  ar  model  parameters  {ccn,m,  ai  )  can  be  employed  as 
features,  an  additional  option  employed  in  random  field- based  approaches  to  object 
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detection  (Ref.  [6])  involves  the  construction  of  normalized,  prediction  error  residuals 
defined  by 


rk,  1 A 


l"hb(k,l)-  £  Sn, m hb (k  +  n,  1  +  m) -  jlv 
[_  (nirOeSt1') 


/ol 


(3-31) 


When  the  random  field  model  defined  by  (3-22)  holds,  the  normalized  prediction  error 
residuals  defined  by  (3-31)  will  approximately  represent  a  spatially  independent 
random  field.  In  addition,  large  r^i  values  will  be  representative  of  areas  of  the  field 
which  are  atypical.  The  above  discussion  suggests  that  correlation  coefficient,  and 
mean  square  power  estimates  constructed  using  r^j  values  over  processing  sub¬ 
windows  within  Rg,  will  be  useful  features. 

The  above  described,  stochastic  model  based  approaches  to  feature 
~ or  "traction  were  based  on  viewing  hb(*,  •)  as  a  continuous  valued  random  field.  The 
next  texture  measure  related  features  described  here  are  defined  using  grey-level 
images,  i.e.,  images  that  assume  a  discrete  set  of  values.  Such  grey-level  images 
arise  naturally  in  an  imaging  setting,  but  may  be  formed  by  quantizing  a  continuous 
valued  image  into  a  discrete  set  of  ranges,  or  bins.  It  is  assumed  here  that  the  image 
values  [hb  (k,  1)  -  m]  are  mapped  into  the  values  0  .  .  .  nc-1  by  an  unspecified 
quantization  function  Gq[»],  in  order  to  obtain  the  discrete  field,  f(k,  1),  i.e. 

f(k,  1)  =  Gq  [hb  (k,  1)  -  m]  (3-32) 


The  first  grey-level  statistic-related  texture  measures  to  be  defined  are 
computed  from  empirical  joint  grey-level  distribution  estimates.  Let  A  denote  a  two- 
dimensional  discrete  shift  vector  defined  by 

^  =  (Ax,  Ay)  (3-33) 


and  let 


Pa  (i,j)  =  Pr  ( (f(k,  1)  =  i)  &  (f (k  +  Ax,  1  +  Ay)  =  j))  (3.34) 

Va(iJ)  =  [Pa  (i,j)  +  Pa  0. 0]  H  (3-35) 
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If  9  a  (iJ)  denotes  an  empirically  computed  estimate  of  Va  (i»  j),  then  Ref.  [3]  defines 
the  following  spatial  gTey-level  dependence  (SGLD)  related  features: 


(1)  Energy: 

Ea=  X  (9a  (id))2 

(2)  Entropy: 

ha=  X  '  9a  0»  j)  In  [9a  (i»  j)] 

(3)  Correlation: 

Ca  =  Z  [0  ’  £x) (j  -  £y) 9a  (i .j) / (Sx  oy)] 

(4)  Local  Homogeneity: 

la=  X  —  j.  9a  Ut  j) 

1  +(i-jr 

(5)  Inertia: 

lA=I  (i-jfVA(i.j) 

with  {fix,  Hy,  cjx,  oy}  defined  by 

?x  =  X  1  9a  (i,  j) 

£y  =  X  J’9a  Oj) 

<*x  =  X  (i-?xF9A  (i, j) 

Sy=  X  (j  ‘  fyf  9a  (i,j) 


(3-36) 

(3-37) 

(3-38) 

(3-39) 

(3-40) 

(3-41) 

(3-42) 

(3-43) 

(3-44) 


The  features  (Ea,  Ha,  Ca,  La,  Ia)  should  be  computed  for  a  collection  of  2D  shifts  over 
a  region  of  the  form  defined  by  Figure  3-2,  due  to  the  symmetry  of  Ya(v)  through  the 
origin  in  A. 
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A  second  set  of  grey-level  statistic-related  texture  measures  are  computed 
from  the  empirical  distributions  of  the  absolute  value  of  grey-level  differences.  Let 
riA(i)  be  defined  by 


TU  (0  =  Pr  ik  fe.  2)  -  f  (k  +  Ax,  i  +  Ay  J  =  i)  (3_45) 

and  let  T]&  (i)  denote  a  corresponding  empirical  estimate.  Then  Ref.  [3]  defines  the 
following  grey-level  difference  (GLD)-related  features: 

(1)  Contrast: 

CONi=  X  i2»U(i) 

(2)  Angular  Second  Moment: 

ASMa=  £  (ru(i))2 

(3)  Entropy: 

ENTa  =  -  X  fla  (i)  ln[ffa  (i)] 

(4)  Mean: 

ME  AN  4  =X  i  A  (0 

(5)  Inverse  Difference  Moment: 

Note  from  the  definition  of  Tj^Ci)  that  the  features  computed  will  be  dependent  on  the 
sign  of  the  direction,  A,  and  hence  {CON^,  ASM^,  ENT&,  MEAN^,  IDM^}  would 
typically  be  computed  for  a  collection  of  A's  over  a  neighborhood  of  the  origin,  from 
which  (0,  0)  was  excluded. 

A  third  and  final  set  of  grey-level  statistic-related  texture  measures  are 
computed  by  counting  grey-level  "runs"  of  various  lengths.  A  "run”  of  grey  levels 
consists  in  a  collection  of  adjacent  pixels,  along  some  direction  9,  having  the  same 
grey  level  value.  Let  ^0  (i,  j)  denote  the  number  of  runs  of  length  j,  associated  with 


(3-46) 


(3-47) 


(3-48) 


(3-49) 


(3-50) 
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direction  9  and  grey  level  value  i,  contained  within  the  processing  window  Re.  Then 
Ref.  [3]  defines  the  following  grey-level  run  length  (GLRL)-related  features: 


(1)  Short-Run  Emphasis: 

RFi.e  4^-Z 

tr  f 


(2)  Long-Run  Emphasis: 

J2  ^bOJ) 

Tr 


(3)  Grey-Level  Distribution: 

RF3.8  4=!-X(s^(i.j)] 
T«  i  I  j  I 


(4)  Run-Length  Distribution: 

R[xi>£  J-Z  (£««>) 

Tr  j  \  i  1 


(5)  Run  Percentages: 


RFs.e^EW.j) 


where 


(3-51) 


(3-52) 


(3-53) 


(3-54) 


(3-55) 


(3-56) 


The  features  {RFi,e,  RF2,q  RF3(e,  RF4,e  RFs.e)  are  typically  computed  for  directions  9 
=  0%  45%  90°,  135*. 

Research  in  Refs.  [3],  [4]  has  investigated  the  use  of  SGLD,  GLD,  GLRL 
features  in  discriminating  between  textures  in  aerial  terrain  photo  and  synthetic  fields 
generated  using  Markov  chain  models.  These  results  suggest  that  SGLD  and  GLD- 
derived  features  were  more  effective  in  separating  different  texture  classes  than 
GLRL  features.  Hence,  of  the  grey-level  texture  measures  discussed  above,  the 
SGLD  and  GLD  features  may  be  the  most  promising. 
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The  above  discussion  has  centered  on  traditional  spatial  and  spectral  texture 
measures.  Recent  mathematical  research  under  the  heading  of  fractal  geometry  has 
resulted  in  new  geometric  concepts  for  describing  and  quantifying  the  degree  of 
irregularity  of  natural  objects.  Hence,  an  additional  feature  of  interest  may  be 
estimates  of  the  fractal  dimension  of  the  3D  terrain  surface  over  a  processing  window, 
Re-  Fractal  dimension  features  have  already  been  successfully  employed  in  Ref.  [9]  to 
discriminate  natural  from  man-made  objects  in  images.  The  brief  discussion  that 
follows  outlines  the  use  of  Mandelbrot  measures  in  fractal  dimension  estimation  as 
described  in  Ref.  [  10].  Let  a  3D  point  set  \  be  defined  by 

/  xm 

ric,i=  VM  (k,  l)  e  Re 

\  rib  (k,l)  (3-57) 


Then,  by  centering  spheres  of  radius,  L,  at  each  of  the  v^i  and  counting  the  number  of 
3D  terrain  points  that  are  enclosed,  the  "Mandelbrot”  probability  distribution  P(m,L) 
may  be  estimated,  where 

P(m,  L)  =  the  probability  that  a  sphere  of  diameter  L  encloses  m-points  (3-58) 
Now,  by  computing  M(L)  defined  as 

M(L)^£  mP(m,L)  (3-59) 


and  repeating  the  above  procedure  for  a  sequence  of  L  values,  the  fractal  dimension 
estimate  may  be  obtained  as  the  slope,  D,  of  the  graph  of  ln(M(L))  versus  ln(L),  i.e., 
through  solving  a  linear  least-squares  problem  involving  the  minimization  of 


XDMMOJ-DIndJ]2 


(3-60) 


Next,  the  present  subsection  on  stochastic  depth  surface  model-based  feature 
construction  techniques  is  concluded  by  sketching  the  potential  use  of  wavelet 
transform-based  features.  Recall  that  through  (3-9)  -  (3-21),  2D  Fourier  spectra 
related  features  were  defined.  The  basis  functions  associated  with  Fourier 
expansions,  i.e.,  complex  exponentials,  are  localized  in  their  frequency  content,  but  not 
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in  terms  of  their  spatial  duration.  Wavelet  signal  representation  techniques  (see  Ref. 
[11])  allow  for  expansion  in  terms  of  basis  functions  that  are  both  orthogonal  and 
localized  in  terms  of  both  spatial  duration  and  frequency  content. 

In  the  discussion  that  follows,  the  wavelet  signal  representation  of  a  2D 
continuous  function  g(x,y)  is  briefly  described,  in  order  to  motivate  the  construction  of 
wavelet  transform  related  features.  Two  effectively  finite  duration  ID  functions  that 
play  a  key  role  in  wavelet  based  signal  representatives  are  termed  the  "scaling 
function",  <})(•),  and  "wavelet  function",  y(»).  The  following  orthonormal  2D  functions 
are  defined,  and  associated  with  2D  basis  functions  underlying  the  wavelet  signal 
representation: 


(*,  y)  =  2j  4>(2j  (x  -  2'j  n))  $  ( (2j  (y  -  2~j  m)) 
(x,  y)  =  2j  <D(2>  (x  -  2'j  n))  y  ( (2J  (y  -  2 J  m)) 
(x,  y)  =  2J  \p^2j  (x  -  2‘j  n))  <>  ( (2j  (y  -  2'J  m)) 


(x,  y)  =  2j  \|/{2J  (x  -  2  j  n))  \j/  ( [l}  (y  -  V  m))  (3-64) 

The  index  "j"  in  the  above  notation  denotes  the  j-th  resolution,  while  the  indices  (n,  m) 
serve  to  determine  the  regions  over  which  the  basis  functions  are  effectively  non-zero. 
Using  the  notations  {•},  H^u)  {•),  {•},  H<3-j)  {•}  to  denote  projection 

operators  mapping  a  given  function  g(»,  •)  onto  the  linear  spaces  spanned  by  the 
associated  orthonormal  basis  functions,  then  the  key  practical  result  underlying  the 
use  of  wavelet  expansions  can  be  expressed  as 

H<a0>{g}  =  HM{g}+  £  [H(^){g}  +  HtzJ){g}  +  H<3-j){g}] 

J  =  J  (3-65) 

Due  to  the  special  properties  of  <)>(•),  \j/(*),  each  term  on  the  right-hand  side  of  (3-65) 
is  mutually  orthogonal  and  is  itself  represented  in  terms  of  orthogonal  basis  functions. 
The  { g }  term  represents  a  coarse,  low  frequency  approximation  to  the  function. 
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while  the  Hri.j)  { g } ,  H(2J)  (g},  { g }  terms  represent  higher  spatial  frequency 

effects  that  appear  at  progressively  finer  resolutions,  and  correspond  to  effects 
resembling  vertical  "edges,"  horizontal  "edges,"  and  "comers,"  respectively. 

When  the  above  2D  signal  representation  framework  is  applied  to  the 
representation  of  discrete  images  as  in  Refs.  [11]  and  [12],  the  coefficients  associated 
with  the  expansion  of  each  of  the  projections  on  either  side  of  (3-65)  can  be  associated 
with  "images."  These  coefficients  can  then  be  used  as  features  for  the  detection  of 
"anomalous"  objects,  which  appear  at  some  specified  resolution.  Hence,  one  feature 
of  potential  interest  would  be  ms  power  estimates  associated  with  sub-windows  of 
the  "images"  associated  with  each  of  the  Hrio)  (•},  H(2*i)  {•},  {•}  terms. 

3.2  Feature  Calculation  Based  on  Deterministic  Depth  Surface  Modeling 

The  preceding  discussion  adopted  a  statistical  perspective  on  the 
representation  of  the  terrain  bottom  height  hb(x,y),  and  the  construction  of  features  for 
mine  classification.  The  discussion  that  follows  will  adopt  a  deterministic  surface 
model-based  perspective,  similar  to  tha*  adopted  in  the  3D  object  recognition  and 
machine/computer  vision  literature.  A  comprehensive  review  of  3D  object  recognition 
approaches  by  Besl  and  Jain  (Ref.  [13])  suggests  that  object  recognition  is  best 
solved  by  recognizing  the  individual  surface  regions  associated  with  the  depth  map 
functions  for  specific  object  types.  Hence,  the  structure  of  3D  object  recognition 
algorithms  often  consists  of 

(1)  The  fitting  of  surface  representations  and  extraction  of  geometric 
features. 

(2)  The  construction  of  a  symbolic  representation  of  the  collection  of 
surfaces  defining  a  depth  map,  using  the  features  extracted  in  (1). 

(3)  The  "matching"  of  derived  symbolic  representations  against  analogous 
symbolic  representations  computed  for  specific  object  types. 

Our  discussion  will  focus  on  step  (1),  with  the  view  that  the  geometric  features 
identified  will  become  inputs  to  neural-network-based  classifiers  for  mine  recognition. 

The  development  that  follows  will  first  motivate  the  selection  of  curvature  and 
surface  normal-related  features  by  appealing  to  the  differential  geometry  of  surfaces. 
Next,  specific  generic  surface  fitting/representation  approaches  will  be  reviewed  that 
can  support  the  calculation  of  the  identified  geometric  features.  Then,  noting  that  the 
above  described  approaches  employ  general  surface  representation  techniques,  while 
mines  frequently  have  spherical  or  cylindrical  surfaces;  we  will  finally  consider  the  use 
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of  specialized  quadric  surface  fitting  techniques  for  the  calculation  of  shape-related 
features. 

We  next  define  some  concepts  and  notation  from  the  differential  geometry  of 
surfaces  (Ref.  [14]).  In  general,  a  parametric  3D  surface  is  represented  by  a  vector 
valued  function  of  two  variables,  f(u),  where 


f(u)£ 


/fi(u)\ 

fciuj 

\  f3  (u)  / 


(3-66) 


with  fi(u),  f2(u),  f3(u)  representing  x,  y,  z  coordinate  functions,  respectively  and 


u4 


(3-67) 


In  the  case  of  a  Monge  or  graph  surface,  (3-66)  is  specialized  so  that 


fi(u)=ui  (3-68) 

f2(u)  =  u2  (3-69) 


If  the  differential  of  a  vector  valued  mapping,  f(u),  denoted  by  df,  is  defined  as 

df  4  fUl  dui  +  fU2  du2  (3-70) 

then  the  first  and  second  fundamental  forms  are  defined  by 

I  (u,  du)  4 -df  •  df  =  du' G  du  (3-71) 

II  (u,  du)  4 -df  •  dn  =  du’ B  du  (3-72) 


where  n  denotes  the  surface  normal  field  defined  by 


n(u)  =  fu'*^- 


I  |fui  X  fuJ  I 


(3-73) 
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and  G,  B  denote  first  and  second  fundamental  form  matrices  that  are  functions  of  u. 
The  normal  curvature,  kn,  denotes  the  projection  onto  the  normal  n(u)  of  the  curvature 
vector  associated  with  a  curve  on  the  3D  surface,  passing  through  f(u),  and  is 
expressed  by 


^  _  n(u,du) 

I  (u,  du)  (3-74) 

The  extreme  values  associated  with  kn,  denoted  by  ki,  k2,  are  termed  principal 
curvatures.  The  Gaussian  and  mean  curvatures,  K  and  H,  respectively,  are  defined  by 


K  =  ki  k2 


(3-75) 


H  =  ^(k1+k2) 


(3-76) 


and  can  in  addition  be  expressed  as  functions  of  first  and  second  fundamental  form 
matrices  as 


K  =  detlcr1  b] 

(3-77) 

H  =  1-  trfG'1  b] 

2 

(3-78) 

Besl  and  Jain  in  Ref.  [15]  summarize  the  properties  that  make  H,K  attractive 
features  for  use  in  a  3D  object  recognition  context: 

( 1 )  H  and  K  are  invariant  to  changes  of  variables  in  the  u  parameters,  with 
the  only  exception  that  sign  changes  in  H  may  occur  if  the  sense  of  the 
surface  normal  n(u)  changes. 

(2)  H  and  K  are  invariant  to  translations  and  rotations  of  the  3D  surface 

(3)  The  signs  of  H,  K  indicate  the  local  shape  of  the  surface. 

In  Refs.  [15]  and  [16]  Besl  and  Jain  indicate  the  following  eightfold  classification  of 
local  surface  shape  based  on  the  signs  of  H,  K: 

(1)  H  <  0,  K  >  0  peaked  surface 

(2)  H  <  0,  K  =  0  -*■  ridge  surface 

(3)  H  <  0,  K  <  0  -*■  saddle  ridge  surface 

(4)  H  =  0,  K  =  0  “*•  flat  surface 
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(5)  H  =  0,  K  <  0  -+•  minimal  surface 

(6)  H  >  0,  K  <  0  -*•  saddle  valley  surface 

(7)  H>0,  K  =  0  valley  surface 

(8)  H  >  0,  K  >  0  *-*■  pit  surface 

The  eight  categories  mentioned  above  are  depicted  in  Figure  3-4. 

Yang  and  Kak  in  Ref.  [17]  advocate  the  use  of  H,K  histograms  in  order  to 
discriminate  between  3D  objects.  These  authors  distinguish  between  the  following 
six  categories  of  curvature  histograms: 

( 1 )  Positive  only  with  peak:  curvature  values  nearly  all  positive  with  a  well- 
defined  peak  value  in  the  curvature  histogram. 

(2)  Positive  only  without  peak:  curvature  is  positive,  but  continuously 
varying,  so  that  no  well-defined  peak  occurs  in  the  curvature  histogram. 

(3)  Negative  only  with  peak:  same  as  (1)  except  involving  negative  values. 

(4)  Negative  only  without  peak:  same  as  (2)  except  involving  negative 
values. 

(5)  Peak  at  zero:  curvature  histogram  has  a  single  peak  at  the  bin 
corresponding  to  zero  curvature. 

(6)  Positive  and  negative  without  peak:  corresponds  to  a  surface  with 
continuously  varying  curvature. 

As  specific  examples,  a  spherical  object  has  a  K-histogram  in  category  (1),  and  an  H- 
histogram  in  category  (3),  while  a  cylindrical  object  has  a  K-histogram  in  category  (5) 
and  an  H-histogram  in  category  (3). 

While  specific  approaches  for  calculating  H,  K  values  will  be  made  explicit  in 
the  discussion  that  follows,  the  above  background  is  sufficient  to  suggest  the  following 
curvature  related  features  for  use  in  neural  network  based  mine  classification.  It  is 
assumed  here  that  the  quantities  designated  below  are  obtained  by  the  calculation  of 
H(u),  K(u),  values  over  a  discrete  collection  of  u  values,  which  are  associated  with  a 
discrete  grid  defined  over  a  specified  processing,  or  estimation  window,  Rg.  The 
proposed  list  of  curvature  related  features  includes 

( 1 )  Minimum  and  maximum  H,  K  values. 

(2)  Average  H,  K  values. 

(3)  The  variances  of  H,  K  values. 

(4)  H,  K  histogram  values. 

Next,  more  explicit  results  than  (3-75)  -  (3-78),  obtained  from  Ref.  [16],  are 
stated  for  the  calculation  of  H,  K.  If  a,  i=l,  2,  3  is  defined  by 
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Figure  3-4.  Local  surface  categories  based  on  H,K  signs. 
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al  =  [fui  X  fllj  *  fuiUl  (3-79) 

a2  =  [fui  xfuj*fu,u2  (3-80) 

a3  =  [fui  x  fU23  *  W2  (3-81) 


then 


H  _  ai  |  |fuj  P  -  2  a2  [fu,  *  fu,]  +  a3 1  {fUl|  |2 

2  I  |fm  x  fuJ  P  (3-82) 

K  =  (ai  a3  ~  a2) 

II^UJ  X  fuJI4  (3-83) 


The  above  general  results  simplify  in  the  case  of  a  Monge  or  graph  surface  ((3-68)  and 
(3-69))  to 


H  =  1 
2 


(f3,  Ui  Ui  +  £3  .U2U2  +  f3  ,uiui  ff.i n  +  f3 

,u2u2ff,ui  -  2f3  ,U1  f3,UiU2) 


(1  +  f3,  Ui  +  f3, 


(3-84) 


_  (f3,  Ul  Ui  f3,  u2u2  '  fj,  UiuJ 


(3-85) 


The  important  conclusion  to  be  obtained  from  (3-82)  -  (3-85)  is  that  the  calculation  of 
H,  K  requires  the  calculation  of  second  order  derivatives  of  the  coordinate  functions, 
fi(u). 

The  above  discussion  has  centered  on  the  definition  of  curvature-related 
quantities  for  use  as  features  in  3D  object  recognition.  Other  research  in  the  3D 
object  recognition  literature  has  considered  the  use  of  surface  normal  orientation 
related  information.  The  unit  vector  surface  normal  field,  n(u),  defined  by  (3-73)  can 
be  characterized  by  its  associated  polar  and  azimuthal  spherical  coordinates,  denoted 
here  by  <}>(u),  0(u),  respectively.  Sethi  and  Jayaramamurthy  (Ref.  [18]),  in  the  case  of 
graph  or  Monge  surfaces,  define  characteristic  contains  as  curves  Cp  satisfying 
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Cp=  (u  :  n(u)»  nref  =  p}  (3-86) 

The  vector  nref  denotes  a  fixed,  reference  direction  unit  vector.  As  specific  examples, 
note  that  the  Cp  contours  associated  with  a  sphere  are  a  family  of  circles  or  ellipses, 
while  for  a  cylinder,  the  analogous  contours  are  a  collection  of  parallel  lines.  A  Hough 
transform  based  approach  is  suggested  in  Ref.  [18]  for  the  characterization  of  Cp 
contours.  Letting  the  function  gr><p  (u)  be  defined  as 


gr,<p{«)=  r-ui  costp-  u2  sintp  (3-87) 

letting  (ri,  <pj)  denote  a  discrete  collection  of  (r,  <p)  values  covering  alternative  lines  of 
practical  interest,  then  the  Hough  Transform  "image"  H(r*,  <pj)  is  determined  (see  Ref. 
[19])  by  initializing  H  (r,,  <pj)  to  zero  and  then  incrementing  by  1  for  each  discrete  uD  e 
Cp  satisfying 


(3-88) 


where  £  is  selected  based  on  the  resolution  of  the  (r,cp)  discretization.  In  the  above 
context,  features  of  potential  use  as  inputs  to  a  neural-network  classifier  include 

( 1 )  H  (rj,  tpj)  "image"  values. 

(2)  (r,  (p)  moments  computed  viewing  H  (rj,  cpj)  as  a  discrete  mass 
distribution,  e.g.,  such  as  the  (r,  tp)  center  of  mass,  and  second 
moments  of  (r,  tp)  computed  about  the  r,  cp  mass  centers. 

While  the  technique  of  Sethi  and  Jayaramamurthy  (Ref.  [18])  uses  surface 
normal  information  to  identify  the  type  of  characteristic  contours  Cp  in  the  (ui,  U2) 
plane,  and  hence  to  identify  object  type,  an  approach  proposed  by  Herbert  and  Pence  in 
Ref.  [20]  suggests  the  direct  use  of  the  Hough  transform  approach  on  calculated 
surface  normal  components  in  order  to  test  for  relationships  satisfied  for  alternative 
surface  types.  As  specific  examples,  consider  the  following  Hough  transform  function, 

go»L0*  (ni’  n2>  n3),g£YeV  (ni,  n2,  n3)  which  play  the  same  role  as  gr,<p(u)  in  (3-87),  in 
defining  Hough  transform  "images"  H^fy)*,  0*)*  H^CYL^<t>i  ,  0j  ) : 


g0*L0* (nt, n2,  n3)  =  sine))*  cos  0*  ni  +  sine))*  sin  0*  n2  +  cos<>*  n3  -  1  (3.89) 


gp*y  (ni,  n2,  n3)  =  sin  <j>*  cos  0*  nj  +  sin  <j>*  sin  0*  n2  +  cos  4>*  n3  -  1  (3.90) 
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Note,  that  (3-89)  and  (3-90)  specify  quantities  that  are  exactly  zero  for  surface  normal 
components  associated  with  planar  and  cylindrical  surfaces,  respectively,  when  (<))*, 
0*)  denotes  the  spherical  coordinates  of  the  planar  surface  normal  or  the  axis  of  the 
cylinder,  respectively.  The  above  discussion  suggests  that  Hough  transform-based 
features,  analogous  to  those  described  in  the  case  of  characteristic  contours,  Cp  ,  but 
based  on  generalized  Hough  transforms  computed  using  (3-89)  and  (3-90),  may  be  of 
interest. 

The  approaches  presented  in  Refs.  [18]  and  [20]  make  use  of  n(u)  vector  field 
alone.  Horn  (Refs.  [21]  and  [22])  suggests  the  use  of  derived  image  termed  the 
"extended  Gaussian  image"  (EGI),  which  combines  both  surface  normal  and 
curvature-related  information.  In  the  case  of  convex  objects,  there  exists  a  unique 
correspondence  between  surface  points  and  associated  surface  normal  directions. 
Hence,  for  a  convex  object,  the  continuous  EGI,  which  may  be  viewed  as  a  surface 
mass  density  defined  on  the  unit  sphere,  G(CONV\^0)  is  defined  by 


o<CONVU(u),e(u))=  -J- 

K(u) 


(3-91) 


For  nonconvex  objects,  with  a  finite  or  countable  number  of  points  having  the  same 
orientation,  the  continuous  EGI  is  defined  by 


G{0,0) 


y  _i_ 

A|K(UiJ 

{1:  4>(u)  =  0,0(10  =  0} 


(3-92) 


For  objects  with  an  uncountable  number  or  points  having  the  same  orientation,  the 
continuous  EGI  involves  impulses  defined  on  the  surface  of  the  unit  sphere  (see  Ref. 
[21])  for  a  precise  definition  in  this  case).  For  our  purposes,  (3-92)  is  an  adequate 
definition  since  our  main  interest  is  in  a  discrete  version  of  the  EGI,  termed  the  DEGI. 
The  discrete  EGI,  or  DEGI,  is  determined  by 

•  A  discrete  set  of  cells  ap  p  =  1  . . .  Ms  defined  on  the  unit  sphere. 

•  A  discrete  set  of  rectangular  cells  in  the  u  parameter  space,  with 
associated  center  ug,  and  areas  Ag  £  =  1  . . .  Mp. 


so  that 
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(£  (4>  (U£),  0  (uj  6  Op) 


(3-93) 


Alternatively,  since  the  Gaussian  curvature  can  be  formulated  in  terms  of  area-related 
quantities,  GCp  can  also  be  approximated  as 


GOp  =  X  A£  V Gn  (tig)  G22  (ug)  -  G?2  (ug) 

{£  (<|>(iie),e(iie))eq>} 


(3-94) 


where  G  (u)  denotes  the  first  fundamental  form  matrix  defined  through  (3-71).  Note, 
that  for  the  case  of  a  Monge  or  graph  surface 


V  G11  (ug)  G22  (ug)  -  G12  (ug)  =  V 1  +  f|,  Ul  +  fl.  u2 


(3-95) 


The  use  of  the  DEGI  in  a  3D  object  recognition  context  is  complicated  by  the 
fact  that,  in  general,  it  is  viewpoint-dependent,  and  that  typically  only  the  visible 
portion  of  a  surface  is  available  for  its  computation.  One  way  to  derive  features  from 
the  DEGI  whose  viewpoint  dependence  is  lessened  is  to  regard  GCp  as  defining  a 

discrete  mass  distribution  on  the  unit  sphere  and  to  compute  an  appropriate  center  of 
mass,  with  moments  defined  based  on  expected  powers  of  spherical  distances  from 
the  mass  center.  It  is  assumed  that  GCp  is  viewed  as  a  point  mass  associated  with  an 

appropriately  selected  center  (0p,  0p)  of  the  spherical  cell  ap.  Then,  the  spherical 
coordinates  of  a  center  of  mass,  (0Cm»  9cm)  can  be  defined  as 

Z  4>pGoP 

Z  G°p  (3-96) 

X  ®p  GOp 

Z  G°P  (3-97) 


0cm 


®cm 


Hence,  if  ds  (<|>i,  0i,  02-92)  denotes  the  great  circle  arclength  between  the  two  points 
(0i»  9i).  (02.  62)  then  expected  powers  of  the  spherical  distance  from  the  mass  center 
(0cm.  9cm)  can  be  defined  by 


mq  = 


(3-98) 


^  d?  ($p>  6p»  0cm»  ®cm) 

I  Gap 

To  conclude  our  discussion  of  the  EGI  and  DEGI,  moment  features  like  that  defined  by 
(3-98)  constitute  an  additional  class  of  potential  inputs  for  neural-network-based  mine 
classifiers. 

The  above  discussion,  motivated  by  ideas  from  differential  geometry  and  the 
3D  object  recognition  literature,  has  centered  on  the  identification  of  curvature  and 
surface  normal-related  features  for  use  in  mine  classification.  The  discussion  that 
follows  will  focus  on  a  review  of  selected  techniques  for  the  fitting  of  surface 
representations  in  order  to  support  the  calculation  of  the  above-described  features. 
The  majority  of  the  various  surface  representation  techniques  available  from  the 
approximation  theory  and  computer  graphics  literature  divide  into  interpolator 
approaches  (denoted  here  by  the  abbreviation  IA)  in  which  the  approximant  function 
exactly  reproduces  the  data  at  specified  points,  and  least-squares-based  approaches 
(denoted  here  by  the  abbreviation  LSA),  in  which  the  approximant  function  solves  a 
least  squares  optimization  problem.  In  addition,  techniques  can  be  categorized  based 
on  their  ability  or  inability  to  handle  nonuniformly  spaced  data  (denoted  here  by  the 
abbreviations  NUDA,  UDA,  respectively).  The  selected  approaches  described  here, 
together  with  their  categorizations  based  on  the  above  defined  abbreviations  are  as 
follows: 

(1)  Regression  using  discrete  orthogonal  polynomials  (Ref.  [23])  (LSA, 
UDA). 

(2)  Moving  weighted  least-squares-based  surface  representation  (Ref. 
[24])  (LSA  or  IA,  NUDA). 

(3)  Polynomial-spline-based  surface  representation  (Ref.  [24])  (IA, 
NUDA). 

(4)  Smoothing-spline-based  surface  representation  (Refs.  [25]  and  [26]) 
(LSA,  NUDA). 

Before  reviewing  surface  representation  approaches  (1)  -  (4)  defined  above,  it 
is  worth  noting  that  in  the  application  of  any  of  the  techniques,  fundamental  decisions 
must  be  made  concerning: 

(i)  The  manner  in  which  the  3D  surface  will  be  parameterized,  i.e.,  through 
a  general  parametric  representation  for  each  coordinate  function  like 
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that  defined  in  (3-66),  or  through  the  more  specialized  Monge  surface 
representation  implied  by  (3-68)  and  (3-69). 

(ii)  The  manner  in  which  surface  representations  will  be  employed  to 
calculate  curvature  and  surface  normal  related  features. 

In  the  case  of  (i),  the  use  of  the  more  general  parametric  surface  representation 
implies  the  need  to  solve  a  surface  representation  problem  for  each  coordinate 
function.  Approaches  for  constructing  a  discrete  set  of  2D  parameter  vectors  Uy  to  be 
associated  with  a  discrete  collection  of  3D  vector  values  fy  are  described  in  Refs.  [27] 
and  [28].  In  the  case  of  (ii),  there  exist  two  alternative  approaches: 

(a)  3D  vector  values  fy  over  some  collection  of  processing  windows,  each 
denoted  by  Re,  can  be  used  to  construct  a  single  surface  representation 
over  each  Re,  which  can  then  be  differentiated  and  employed  to  obtain 
curvature  and  surface  normal  related  features  at  every  enclosed  pixel. 

(b)  By  implicitly  solving  surface  representation  problems  over  windows, 
RE.i.j  centered  about  each  pixel,  2D  convolutional  filters  may  be 
determined  for  the  explicit  calculation  of  first  and  second  derivative 
information  necessary  to  support  calculation  of  curvature  and  surface 
normal-related  features.  (This  approach  is  only  an  option  when  the 
distribution  of  data  points  is  uniform  in  the  parameter  space.) 

The  case  (a)  above  enjoys  the  philosophical  advantage  of  corresponding  to  the 
representation  of  a  single  well-defined  approximating  surface  for  each  processing 
window,  Re-  However,  the  speed  with  which  convolutional  filtering  operations  may  be 
performed  suggests  a  substantial  computational  advantage  to  (b). 

The  above  discussion  suggests  that  the  adoption  of  a  surface  modeling 
approach  based  on  a  Monge  representation  and  a  feature  calculation  approach  based 
on  case  (b)  are  favored  from  a  computational  standpoint.  Hence,  most  research 
reported  in  the  3D  object  recognition  literature  makes  these  assumptions.  However, 
the  work  reported  in  Ref.  [27]  has  made  use  of  the  more  general  3D  surface 
parameterization,  as  well  as  a  feature  calculation  approach  based  on  case  (a). 

The  discussion  that  follows  will  review  surface  representation  approaches  (1)  - 
(4)  mentioned  above  in  the  context  of  approximating  a  2D  function  F(x,y),  for  which 
discrete  values  Fy  are  available  on  the  discrete  grid  (x;,  yj )  i  =  1. .  Nx,  j  =  1  . . .  Ny.  In 
the  discussion  of  approach  (1),  regression  using  discrete  orthogonal  polynomials,  to 
;  consistent  with  Refs.  [23]  and  [16],  it  is  assumed  that  the  (x;,  yj)  grid  is  uniform  in 
both  coordinates,  and  that 
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(3-99) 

(3-100) 


Nx  =  2MX  +  1 
Ny  =  2My  +  1 


Given  (3-99)  and  (3-100),  it  is  possible  to  define  the  shifted  discrete  coordinate 
parameters 


fk  =  i-(Mx+l) 

\/  =  j-(My  +  l)  (3-101) 

(3-102) 

and  the  discrete  data  field  Gk, /defined  by 

_  c  (  k  =  -Mx . .  .Mx 

Gk,/  =  Fk+(M„  +  l),/+(My  +  l)/ 

\  /  =  -My  .  .  .My  (3-103) 

Hence,  the  (k,  /)  =  (0,  0)  point  corresponds  to  the  center  of  the  originally  defined 
discrete  data  field,  Fy. 

Now,  let  <t>xq  (•),  (j)^  (•)  q  =  0  ...  mx,  r  =  0  ...  my  denote  orthogonal  polynomials 
associated  with  x,  y  coordinate  directions,  i.e. 


X  ^xqi  (u)  ^xqz  (u)  -  N’Xqj  Oq(,  q2 

u=-M,  (3-104) 

My 

^  0yn  (v)  0yr2  (v)  =  Nyr,  Sr!,  r2 

v=-My  (3-105) 

Then  the  least-squares  approximant  Gk,/  for  the  Gk,/  field,  when  represented  in  terms 
of  the  basis  [<>xq(*)<t>y  r(*)]  functions,  can  be  expressed  as 


where 


Gk.  /  =  2^  **qj  4*xq  M  4fyr  (4 

q.r 


(3-106) 


33 


(3-107) 


X  Gk,/<t>xq(k)<t>yr(/) 

V  =  (N„  •  N„) 

If  (k,  /)  are  now  viewed  as  continuous  parameters,  (3-106)  can  be  differentiated  to 
provide  expressions  for  first  and  second  derivatives.  In  particular,  by  using  the 
resulting  expression  for  k  =  0,  /=  0,  we  obtain  convolutional  filter  forms  for  calculating 
first  and  second  derivatives  mentioned  in  the  above  discussion.  Ref  [16]  provides 
specific  examples  of  <J>Xq(*),  <t>yr(*)  for  the  case  of  discrete  orthogonal  polynomials  that 
are  no  higher  than  second  order,  i.e.,  at  most  quadratic. 

The  orthogonal  polynomial  regression  based  surface  representation  technique 
described  above  is  very  attractive  in  a  computational  sense,  but  is  unable  to  handle 
the  problem  of  scattered,  unevenly  distributed  data.  The  moving  weighted  least- 
squares-based  surface  representation  approach  described  next  can  handle 
nonuniformly  distributed  data,  but  at  the  price  of  requiring  the  solution  to  a  general 
linear  least-squares  problem,  and  in  some  cases  demanding  the  solution  of  a  new 
least-squares  problem  for  each  point  at  which  an  approximant  value  is  desired.  Let  z 
denote  a  vector  representing  a  2D  point,  and  let  b(z)  denote  a  Q-dimensional  vector 
valued  function  whose  components  correspond  to  the  2D  basis  functions  to  be 
employed  in  the  formulation  of  the  least-squares  problem,  e.g.,  xPi  xP2  for  0<  pi  +  p2 
<2  specifies  a  case  for  which  Q  =  7.  In  addition,  let  zm  m  =  1  ...  NxNy  denote  the 
collection  of  2D  (x;,  yj)  pairs  and  F  denote  the  corresponding  vector  of  associated 
function  values  made  up  of  Fy's.  Then  defining  the  matrices 

:  •••  :  b(zNlNy))  (3.108) 

ill)  \ 

w'(l  Y  -  ZN,  nJ  I);  (3-109) 

where  w(»)  is  some  non-negative  weighting  function,  the  approximant  function  at  z, 
F  (z)  is  expressed  as 


W(z)A| 


(w(||z- 


F  (z)  =  b'(z)  a  (z) 


(3-110) 
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where  a  (z)  is  the  Q  vector  solution  of  the  linear  least-squares  optimization  problem 
expressed  by 


inf{  ||W1/2(Z)  [B'  a  -  F]||2} 


Recall  that  the  solution  of  a  generic  linear  least-squares  optimization  problem  was 
discussed  before  in  the  context  of  2D  autoregressive  model  fitting,  through  (3-26)  and 
(3-29).  The  type  of  surface  approximant  obtained  in  (3-110)  varies  with  the  selection 
of  the  weighting  function  w(*).  In  the  case  when  w(»)  is  a  non-negative  constant,  the 
approximant  represents  an  ordinary  least-squares  surface  fit,  and  a  (z)  is  not 
dependent  on  z.  In  the  case  when  w(«)  is  a  non-negative  function,  bounded  at  0,  e.g., 

w(5)  =  ci  e-£2/c2  (3-112) 

then  the  approximant  represents  a  least-squares  fit  that  weights  data  points  according 
to  their  distance  from  the  point  of  interest.  In  the  case  when  w(»)  is  a  non-negative 
function  unbounded  at  0,  e.g., 

w(^)  =  [ci  e-|42/C2)]/  ^  (3- 1 1 3) 

the  approximant  interpolates  at  the  data  point  and  represents  a  least-squares  fit  off 
the  data  points.  Finally,  it  should  be  noted  that  the  smoothness  properties  of  the 

xv 

fitted  surface  F  (z)  varies  with  the  smoothness  properties  of  the  b;  (z)'s  and  the 
choice  of  w(»).  In  the  case  when  the  bj(z)'s  are  2D  polynomials  and  w(^)  =  l/^2k  for  k 
>  1,  F  (z)  has  been  shown  to  be  infinitely  continuously  differentiable. 

In  general,  the  weighted  moving  least-squares-based  surface  representation 
approach  described  above  requires  the  solution  of  a  new  least-squares  problem  for 
each  new  data  point  z  at  which  F  (z)  is  desired.  In  addition,  the  calculation  of  first  and 

second  spatial  derivatives  of  F  (z)  requires  the  calculation  of  corresponding 

,  /*..  . 

derivatives  of  a(z),  which  in  general  must  be  calculated  through  the  solution  of  linear 
systems  of  equations  as  discussed  in  Ref.  [24]. 

Approach  (2)  as  described  above  may  be  useful  as  a  data  gridding  approach, 
i.e.,  an  approach  for  obtaining  a  uniformly  distributed  data  set  from  an  irregular  or 
scattered  data  set.  In  this  case,  the  application  of  approach  (2)  might  be  followed  by  a 
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more  computationally  attractive  interpolator  technique  for  the  calculation  of  required 
surface  derivatives. 

Surface  representation  approaches  (1)  and  (2)  described  above  adopted  a 
least-squares  approximation  formulation  in  a  discrete  and  continuous  space  setting, 
respectively.  The  use  of  a  least-squares  approximation  approach  is  motivated  by  the 
practical  problem  of  fitting  surfaces  to  noisy  data.  In  situations  where  data  noise 
effects  are  small,  or  the  data  has  been  smoothed  (see  Ref.  [29])  or  gridded  using  a 
least-squares  approach  like  that  described  in  approach  (2)  above,  the  use  of  an 
interpolator  approach  like  the  polynomial  spline  surface  representation  approach  (3) 
may  be  justified.  In  the  discussion  that  follows,  two  approaches  to  the  fitting  of  cubic 
polynomial  splines  are  described,  based  on  the  use  of  cardinal  splines  and  B-splines, 
respectively. 

Both  the  cardinal  spline  and  B-spline  approaches  are  best  introduced  first  in  a 
ID  curve  representation  setting.  The  cardinal  spline  approach  is  based  on 
representing  a  function  of  x,  with  associated  discrete  function  values  fx;  i  =  1  .  .  .  Nx, 
by  an  approximant  of  the  form  sn>(p)>  where 


N.  N. 

SlDip)  =  2  <J>*i  (p)  f*i  +  2  Vx,  (p)  mxi 

i=i  i=i  (3-114) 

where  the  mxi  denotes  slope  values,  and  the  functions  Ox;  (•),  H'x;  (•)  are  piecewise 
cubic  functions  with  support  on  [xj.i,  x  j+i]  and  satisfying  the  following  conditions: 

=  5ilt  j2  ^'^(xtf)  =  0  (3-115) 

't\(*i!)  =  0  (Xij  =  Si,.  b  (3-116) 

Explicit  expressions  for  d>Xj(»),  'Fxj  (•)  are  obtained  in  Ref.  [24],  Letting  mx,  fx  denote 
vectors  of  mx;,  fxj  values  respectively,  then  the  ID  cardinal  spline  fitting  approach 
involves  the  solution  of  a  linear  system  of  the  form 

Axmx  =  [CxfJ  (3-117) 

in  order  tc  guarantee  the  continuity  of  second  derivatives  of  sio(p)  at  interval 
endpoints.  The  Ax,  Cx  matrices  are  tridiagonal  matrices,  entirely  dependent  on  the 
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spacing  of  data  points,  and  in  part  reflecting  the  form  assumed  for  boundary  conditions 
at  the  endpoints  xt,  xN^.  Note,  that  tridiagonal  systems  of  linear  equations  of  the  form 

in  (3-117)  can  be  solved  by  specialized  algorithms  involving  0(NX)  operations  ([8]). 
Letting  0Xi  denote  the  i-th  column  of  the  matrix  Ox  defined  by 

Ox  =  Ax  Cx  (3-118) 

then  sio(p)  can  in  view  of  (3-114),  (3-117),  and  (3-118)  be  expressed  as 

N, 

?id(P)  =  X  H,j(p)fxj 

i-l  (3-119) 


where 


Hxi  (p)  £  ['Fxi  (p)  •  •  •  4* xN*(p)]  Oxi  +  cDxi(p)  (3-120) 

The  Hxi(»)'s  are  referred  to  as  cardinal  functions.  By  following  through  an 
analogous  argument  to  the  above  for  curve  fitting  along  the  y  coordinate  direction 
Hyj(*)’s  may  defined  for  j  =1  .  .  .  Ny.  Given  Hxj(»)’s  and  Hyj(»)’s  the  following  2D 
spline  approximant  S2d(P>ti)  may  be  defined  for  the  originally  defined  2D  surface 
representation  problem: 


S2D (p»  ^l)  Fi,  j  Hxi (p)  Hyj  (tj)  (3-121) 

Note  that  (3-121)  defines  a  surface  representation  with  the  property  that  derivatives 
up  to  second  order  in  each  variable  are  continuous.  Hence,  (3-121)  may  be 
differentiated  in  order  to  obtain  expressions  for  desired  first  and  second  derivatives. 

The  cardinal  spline  approach  is  attractive  since  in  cases  when  the  cardinal 
functions  and  their  derivatives  can  be  precomputed,  the  operation  of  calculating  spline 
approximant  function  and  derivative  values  simply  involves  the  evaluation  of 
expressions  of  the  form  in  (3-121).  In  the  case  when  the  distribution  of  data  is  uniform 
in  both  variables,  convolution  filter  forms  for  calculating  approximant  function  and 
derivative  values  at  the  center  of  a  window  can  be  obtained  from  (3-121). 
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The  cardinal  spline  approach  suffers  from  the  disadvantage  that  the  evaluation 
of  expressions  of  the  form  in  (3-121)  in  general  requires  the  summation  of  all  Nx  •  N, 
terms.  The  B-spline  approach  avoids  this  pitfall  by  assuming  finite  support  basis 
functions.  The  B-spline  approach  for  representing  a  function  of  x  is  based  on  an 
approximant  of  the  form  Sidb(p)  where 


N,-l 

sidb(p)  =  I  “ i  BXi  (p) 

j  =-2  (3-122) 

where  Bxj(»)  denotes  a  piecewise  cubic  basis  function,  twice  continuously 
differentiable  with  support  on  [xt,  x;  44].  It  should  be  noted  that  the  representation  in 
(3-122)  requires  the  definition  of  knots,  x_2,  x_i,  xo  and  xnx+i,  Xnx+2,  xnx+3  exterior  to 
the  approximant  representation  interval  [xj,  xNJ.  The  Bx;(»)  functions  can  be 

computed  by  recursions  specified  in  Ref.  [24], 

Next,  B-spline  basis  functions  Byj  (p)  with  support  on  [Yj,  Yj+4]  for  j  =  -2  .  .  . 
Ny-i  may  be  defined  and  the  B-spline  approximant  for  our  original  surface 
representation  problem  may  be  defined  as  S2DB  (p,  Tj) ,  where 

Nx  -1  Nyl 

s1Db(P)  =  ^  S  ®i,  j  ®xi  (P)  Byj  ("Hi 

>=-2  j  =  -2  (3-123) 

The  a,j  coefficients  in  the  surface  representation  (3-123)  must  be  determined  by 
imposing  a  total  of  (Nx+2)  •  (Ny+2)  linear  constraints.  These  constraints  can  be 
defined  in  a  variety  of  ways;  two  options  include: 

•  Matching  function  values  defined  at  (x,  y)  knots  over  the  enlarged  region 

[xo,  xNx+i]  x[yo,  yNy+i] 

•  Matching  function  values  defined  at  (x,  y)  knots  over  [xi,  xnx]  x  [yj, 
yNy],  setting  second  partial  derivatives  with  respect  to  x  equal  to  zero 
along  x  =  xj,  x  =  xnx  for  y  e  [yj,  yNy]>  setting  second  partial  derivatives 
with  respect  to  y  equal  to  zero  along  y  =  yi,  y  =  yNy  for  x  e  [xj,  XNy],  and 
finally  setting  the  mixed  second  partial  derivative  with  respect  to  x,  and 
second  partial  derivative  with  respect  to  y  equal  to  zero  at  the  "comer” 
points:  (xb  yO,  (xNx,  yO,  (xlf  yNy),  xNx,  yNy). 

In  either  of  the  above  cases,  imposing  the  described  constraints  results  in  a  system  of 
linear  equations  of  the  form 
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L  =  Qxafly 


(3-124) 


where  L  contains  the  constraint  function  or  derivative  values,  flx,  Qy  are  banded 
matrices  that  include  appropriate  function  or  derivative  values  associated  with  x-axis 
and  y-axis  B-spline  basis  functions,  respectively,  and  a  contains  the  ay  values. 
Hence,  a  can  be  obtained  by  solving  two  successive  sets  of  linear  systems  indicated 
by  the  grouping  of  terms  of  the  below  equation: 


a  =  [q;1  l]  (o')'1 


(3-125) 


Given  the  calculation  of  a  in  (3-125),  the  representation  (3-123)  can  be 
differentiated  in  order  to  obtain  expressions  for  first  and  second  derivative  values. 
Due  to  the  fact  that  the  Bxj  («)’s  and  Byj  («)’s  have  supports  spanning  four  intervals, 
the  evaluation  of  the  approximant  ^DBfp.'H)  at  most  requires  the  summation  of  16 
items.  Finally,  note  that  in  cases  when  the  (x,,  yj)  data  grid  is  uniform,  (3-123)  can  be 
used  in  conjunction  with  (3-125)  in  order  to  determine  convolutional  filter  forms  for 
evaluating  first  and  second  derivatives  of  the  approximant  surface,  at  the  center  of  the 
data  window  (see  Ref.  [17]). 

The  use  of  an  interpolator  spline-based  surface  representation  approach  can 
be  criticized  from  the  perspective  of  ignoring  data  uncertainty.  Hence,  in  approach  (4), 
the  use  of  spline-based  surface  representations  and  least-squares  optimization  is 
combined.  One  example  of  this  approach  involving  polynomial  splines  assumes  a 
surface  representation  of  the  form  in  (3-123).  Next,  it  is  assumed  that  a  finer 
"measurement"  data  grid  is  superimposed  on  the  surface  representation  grid  defined 
by  the  (xj,  yj)  pairs,  which  includes  those  points  as  well  as  others.  In  this  context,  the 
least  squares  surface  fitting  problem  can  be  formulated  (see  Ref.  [15])  as  the  problem 
of  minimizing 


|^-n,oS^E 


(3-126) 


where  I H  lF  denotes  the  squared  Frobrenius  matrix  norm  corresponding  to  the  sum  of 
the  squares  of  matrix  elements,  and  L,  Qx,  fiy  have  analogous  interpretations  to  L, 
flx.  fly  in  (3-124).  The  solution  of  the  optimization  over  a  in  (3-126),  denoted  by  a  , 
can  be  expressed  as 
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a  =  [(aj*  l]  (Oyf 


(3-127) 


where  #  denotes  the  Moore-Penrose  pseudo-inverse.  The  result  in  (3-127)  can  be 
obtained  by  solving  two  successive  sets  of  linear  least-squares  problems  indicated  by 
the  grouping  of  terms. 

A  final  smoothing  spline-based  surface  representation  approach  described  here 
is  the  nonpolynomial  spline  under  tension  approach  discussed  by  Cline  in  Ref.  [26], 
and  employed  to  support  curvature  related  calculations  in  the  3D  object  recognition 
work  of  Vemuri,  Mitche,  and  Aggarwal  in  Ref.  [27].  Cline’s  approximant  function  F  (p, 
T|)  is  the  solution  of  a  variational  problem  defined  minimizing 


r 

n 

i 

i 

N,-l  Nj 

+  a2 

I  1 

*•■11  f  T  “'[f,,  (p\  t|')  •  Fxyd.  i.  jf  dP'dTl' 

i=l  j  =  l  x*  r> 


(3-128) 


where 

£  A  (f  (Xj  + 1 ,  yj-t-i)  -  F{Xj ,  yj+1)  -  F(xi+ ! ,  yj)  +  F(Xi ,  yj)j 
xyd,1,J  ”  (xi + 1  -  xi)  (yj+i  -  yj)  (3-129) 

and  subject  to  the  constraint: 

U  '  /  (3-130) 

The  S  parameter  controls  the  amount  of  "tension"  associated  with  the  resulting 
surface  representation  and  the  5xj,  8yj  are  associated  with  representing  the  size  of 
data  uncertainties.  In  brief  terms,  the  solution  of  the  above  variational  problem 
involves 

/s  ^ 

•  The  solution  of  linear  equations  for  Fxxyy  (•,  •),  Fxx  (•,  •),  Fjy(»,  •)  values 
at  grid  points,  which  are  coupled  with  a  single  nonlinear  equation 
related  to  the  satisfaction  of  (3-130),  and  whrh  then  allows  the 
construction  of  F(*,  •)  values  at  grid  points. 
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•  The  use  of  a  2D  Hermite  interpolation  formula  to  represent  F  (p ,  Tj ) 
values  given  the  calculated  F(»,  •),  Fxxyy  (•,  •),  Fxx  (•,  •),  Fyy  (*,  •) 
values  at  grid  points. 

The  2D  Hermite  interpolation-based  surface  representation  mentioned  above  may 
then  be  differentiated  to  obtain  first  and  second  derivative  values  at  any  desired 
points. 

The  above  discussion  has  reviewed  four  alternative  approaches  to  surface 
representation,  which  can  be  employed  to  support  curvature,  and  surface  normal 
related  feature  calculation  for  mine  classification.  Approach  (2),  the  moving  weighted 
least-squares  surface  representation,  may  be  most  attractive  for  use  in  gridding  non- 
uniform  data  sets,  which  can  then  be  processed  using  the  more  computationally 
attractive  orthogonal  polynomial  regression  or  polynomial  spline-based  surface 
representation  techniques  of  approaches  (1)  or  (3)  in  calculating  desired  first  and 
second  derivatives.  Approach  (4),  while  having  the  advantage  of  combining  spine- 
based  surface  representation  techniques  and  a  least-squares  approximation  approach, 
appears  to  require  very  complex,  computationally  intensive  algorithms. 

The  focus  of  the  preceding  development  has  been  on  the  use  of  general  surface 
representation  techniques  in  order  to  calculate  curvature  and  surface  normal  related 
features,  as  motivated  by  differential  geometry  and  the  3D  object  recognition 
literature.  In  the  present  discussion,  we  recognize  the  fact  that  mines  frequently  have 
specific  forms,  i.e.,  cylindrical  or  spherical,  and  show  how  a  more  specialized  quadric 
surface  fitting  technique  introduced  in  Ref.  [30]  may  be  employed  to  calculate  shape- 
related  features  for  use  in  mine  classification. 

A  quadric  surface  is  represented  by  an  implicit  function  of  three  spatial 
variables  of  the  form 


rnf+f*v  +  d  =  o  (3-i3i) 

where  f  denotes  a  vector  of  x,  y,  z  coordinate  values,  v  denotes  a  constant  vector,  and 
d  a  scalar  constant.  In  the  case  of  a  central  quadric,  i.e.  an  ellipsoid  or  hyperboloid, 
(3-131)  can  be  simplified  into  the  form 


f*'  Q  f  +  y  =  0 


(3-132) 
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where 


and 


f*  =  f  +  c 


(3-133) 


(3-134) 


Y  = 


d  -  -  v '  Q*1  v 


(3-135) 


Spheres  and  cylinders  may  be  represented  as  special  ellipsoidal  surfaces.  In  the 
above  context,  the  suggested  shape-related  features  for  use  in  mine  classification  are 
the  eigenvalues  of  (f2/y),  i.e.  Xi{f2/Y)  i  =  1,  2,  3.  These  eigenvalues  are  invariant 
under  3D  rotational  transformations  for  the  standardized  central  quadric  surface  form 
obtained  by  dividing  both  sides  of  (3-132)  by  y. 

At  this  point,  the  quadric  surface  fitting  approach  suggested  in  Ref.  [30]  is 
described.  The  fitting  process  is  formulated  in  terms  of  ten  parameters  ai  .  .  .  aio 
which  are  related  to  Q,  v,  and  d  as  follows: 


(at 

a4/VI 

a5/V2  | 

Q  = 

aVY2 

a2 

36/^2 

U5/V2 

36/^2 

a3  ) 

(3-136) 

/a7\ 

V  = 

as 

\  a 9  / 

(3-137) 

d  = 

aio 

(3-138) 

In  addition,  let 


/aM 


\aio  / 


(3-139) 
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and  assuming  a  collection  of  3D  data  points  fy  the  fitting  problem  is  formulated  as  the 
optimization  of  E  defined  as 


E=X  [f-JQfi.J  +  fi.J-v  +  d]2 


over  the  p  vector  variable,  subject  to  the  constraint  that 


INI  =  i 


(3-140) 


(3-141) 


The  condition  (3-141)  guarantees  a  nontrivial  solution  to  the  optimization  over  E. 

To  define  the  solution  of  the  above  constrained  optimization  problem,  note  that 
individual  terms  in  the  summation  in  (3-140)  can  be  expressed  as 


[fijQfi.j  +  fi.j-v  +  d]2  =  p' 


Bid  Q.j  | 

c'ij  i>,j  r 


(3-142) 


for  appropriately  defined  matrices  By  (6  x  6),  Cy  (6  x  4),  Dy  (4  x  4).  Next,  defining 


Ba  =  5Xj 
Ca  =  £  Q,  j 
Da  =  XXj 


(3-143) 

(3-144) 

(3-145) 


then,  from  Ref.  [30],  the  optimal  p  vector  partitions,  pi ,  P2  are  defined  as 


Pi 


eigenvector  of  (®a  -  Q  Da*  CJ  corresponding  to 
the  minimum  eigenvalue,  Da  CJ 


P2  =  D  j1  Ca  Pi 


(3-146) 


(3-147) 


Finally,  note  that  the  minimum  value  of  the  optimization  criterion  E,  Emin,  can  be 
expressed  as 
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(3-148) 


Emin  —  ^-min  (B»  *  CaDa1  Ca) 


The  above  discussion  defines  an  efficient  procedure  for  the  least-squares  fitting 
of  quadric  surfaces  that  avoids  the  need  for  the  use  of  nonlinear  optimization 
techniques.  Since  Emjn  indicates  the  quality  of  the  fit,  it  is  suggested  here  that  {Emin, 
Xi  (O/y)  i  =  1,2,3}  be  the  complete  set  of  quadric  surface  derived  features  for  use  in 
mine  classification. 


4.0  Conclusions 

The  preceding  development  has  considered  a  wide  variety  of  features  for  use  in 
mine  classification,  derived  from  3D  depth  map  information.  These  feature  calculation 
concepts  are  summarized  in  Figure  (4-1).  In  the  case  of  low-resolution,  telesounder- 
derived  depth  maps,  the  focus  of  our  discussion  was  on  the  use  of  shadow  information 
to  calculate  mine  size-related  features,  motivated  by  the  use  of  shadows  by  human 
interpreters  of  side-scan  sonar  data.  In  the  case  of  high-resolution,  swath-bathymetry 
derived  depth  maps,  both  stochastic  and  deterministic  depth  map  model  based  feature 
calculation  concepts  were  identified.  The  stochastic  depth  map  model  based  feature 
calculation  concepts  were  motivated  by  the  literature  on  texture  measures,  and 
random  field  based  approaches  to  image  processing.  Finally,  the  deterministic  depth 
map  model  based  feature  calculation  concepts  were  motivated  by  ideas  from 
differential  geometry  and  the  3D  object  recognition/computer  vision  literature. 
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Low-Resolution  Depth  Map 
Feature  Calculation  Concepts 


(1)  Mine  Diameter  Estimation  Using  Side-Scan 
Sonar  and  Depth  Map  Information _ 


T  |«  D  ^ 
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Feature  Calculation  Concepts: 

Stochastic  Model-Based 
Concepts 


(1)  Autocar-elation  Estimates 

(2)  Spectral  Density-Related  Quantities 

(3)  2D  Autoregressive  Model  Coefficients 

(4)  Normalized  Prediction  Error  Residuals 
Based  on  Fitted  2D  Autoregressive  Model 

(5)  Spatial  Grey-Level  Difference  Texture 
Measures 

(6)  Absolute  Grey-Level  Difference  Texture 
Measures 

(7)  Grey-Level  Run  Length  Texture  Measures 


High-Resolution  Depth  Map 
Feature  Calculation  Concepts: 

Deterministic  Model-Based 
Concepts 


(8) 

121 

(1) 

(2) 

(3) 

(4) 


Fractal  Dimension  Estimates 

Wavelet  Transform-Related  Quantities 
Mean  and  Gaussian  Curvature-Related 
Quantities 

Hough-Transform-Related  Quantities 
Derived  From  Surface  Normal  Information 

Discrete  Extended  Gaussian  Image 
Derived  Quantities  Obtained  from  Surface 
Normal  and  Curvature-Related  Information 

Quadric  Surface  Fit-Related  Quantities 


45 


REFERENCES 


[  1  ]  Denbigh,  Phillip,  "Swath  Bathymetry:  Principles  of  Operation  and  Analysis  of 
Errors,"  IEEE  Journal  of  Ocean  Engineering,  Vol.  14,  No.  4,  October  1989, 
pp.  289-298. 

[2]  NAVSEA  Mine  Familiarizer,  Naval  Mine  Warfare  Engineering  Activity, 
Yorktown,  Virginia,  April  1985. 

[3]  Conners,  Richard  and  Charles  Harlow,  "A  Theoretical  Comparison  of  Texture 
Algorithms,"  IEEE  Transactions  on  Pattern  Analysis  and  Machine 
Intelligence,  Vol.  2,  No.  3,  May  1980,  pp.  204-222. 

[4]  Weszka,  Joan  et  al.,  "A  Comparative  Study  of  Texture  Measures  for  Terrain 
Classification,"  IEEE  Transactions  on  Systems,  Man,  and  Cybernetics,  Vol.  6, 
No.  4,  April  1976,  pp.  269-285. 

[5]  Haralick,  Robert  et  al.,  "Textural  Features  for  Image  Classification,"  i^EE 
Transactions  on  Systems,  Man  and  Cybernetics,  Vol.  3,  No.  6,  November  1973, 

pp.  610-621. 

[6]  Therrien,  C.  W.  et  al.,  "Statistical  Model-Based  Algorithms  for  Image 
Analysis,”  Proceedings  of  the  IEEE,  Vol.  74,  No.  4,  April  1986,  pp.  532-551. 

[7]  Lawson,  Charles  and  Richard  Hanson,  Solving  Least  Squares  Problems, 
Prentice  Hall  Inc.,  Englewood  Cliffs,  New  Jersey,  cl974. 

[8]  Golub,  Gener  and  Charles  Van  Loan,  Matrix  Computations,  John  Hopkins 
University  Press,  Baltimore,  Maryland,  cl983. 

[9]  Stein,  Michael,  "Fractal  Image  Models  and  Object  Detection,"  SPIE,  Vol.  845, 
Visual  Communications  and  Image  Processing  -  II,  1987. 

[10]  Peitgen,  Heniz  and  Dietmar  Saupe  (ed.).  The  Science  of  Fractal  Images, 
Springer- Verlag,  New  York,  c!988. 


46 


[11]  Mallat,  Stepnare,  "A  Theory  for  Multiresolution  Signal  Decomposition:  The 
Wavelet  Representation,"  Vol.  11,  No.  7,  July  1989,  pp.  674-693. 

[12]  Strang,  Gilbert,  "Wavelets  and  Dilation  Equations:  A  Brief  Introduction," 
SIAM  Review ,  Vol.  31,  No.  4,  December  1989,  pp.  614-627. 

[13]  Besl,  P.  J.  and  R.  C.  Jain,  "Three-Dimensional  Object  Recognition,"  ACM 
Computing  Surveys ,  Vol.  17,  No.  1,  March  1985,  pp.  75-145. 

[14]  Lipschutz,  Martin,  Theory  and  Problems  of  Differential  Geometry,  McGraw 
Hill  Book  Co.,  New  York,  1969. 

[15]  Besl,  P.  and  R.  Jain,  "Intrinsic  and  Extrinsic  Surface  Characteristics," 
Proceedings  of  the  Computer  Vision  and  Pattern  Recognition  Conference,  San 
Francisco,  California,  June  9-13,  1985,  pp.  226-233. 

[16]  Besl,  P.  and  R.  Jain,  Surface  Characterization  for  Three  Dimensional  Object 
Recognition,  Report  RSD-TR-20-84  Electrical  Engineering  and  Computer 
Science  Department,  University  of  Michigan,  Ann  Arbor,  December  1984. 

[17]  Yang,  H.  and  A.  Kak,  "Determination  of  the  Identity,  Position,  and  Orientation 
of  the  Topmost  Object  in  a  Pile,"  Computer  Vision,  Graphics  and  Image 
Processing,  Vol.  36,  1986,  pp.  229-255. 

[18]  Sethi,  I.  and  S.  Jayaramamurthy,  "Surface  Classification  using  Characteristic 
Contours,"  Proceedings  of  the  7-th  International  Conference  on  Pattern 
Recognition,  Montreal,  Canada  July  30  -  August  2  1984,  IAPR  and  IEEE 
Press,  New  York,  pp.  438-440. 

[19]  Ballard,  Dana  and  Christopher  Brown,  Computer  Vision,  Prentice  Hall  Inc., 
Englewood  Cliffs,  New  Jersey,  1982. 

[20]  Herbert,  M.,  and  J.  Ponce,  "A  New  Method  for  Segmenting  3D  Scenes  into 
Primitives,"  Proceedings  of  the  6-th  International  Conference  on  Pattern 
Recognition,  Munich,  West  Germany  October  19-22,  1982,  IAPR  and  IEEE 
Press,  New  York,  pp.  836-838. 


47 


[21]  Horn,  B.,  "Extended  Gaussian  Images,"  Proceedings  of  the  IEEE ,  Vol.  72,  No. 
12, 1984,  pp.  1671-1686. 

[22]  Horn,  B.,  Robot  Vision,  MIT  Press  &  McGraw  Hill  Book  Co.,  Cambridge, 
Mass,  and  New  York,  cl986. 

[23]  Haralick,  R.,  "Digital  Step  Edges  from  Zero  Crossing  of  Second  Directional 
Derivatives,"  IEEE  Transactions  on  Pattern  Analysis  and  Machine 
Intelligence,  Vol.  6,  No.  1,  January  1984,  pp.  58-68. 

[24]  Lancaster,  P.  and  K.  Salkauskas,  Curve  and  Surface  Fitting:  An  Introduction, 
Academic  Press,  New  York,  cl986. 

[25]  Cox,  M.,  "Data  Approximation  by  Splines  in  One  and  Two  Independent 
Variables,"  pp.  112-138  in  A.  Iserles  and  M.  Powell  (eds.),  The  State  of  the 
Art  in  Numerical  Analysis,  Clarendon  Press,  Oxford,  cl 987. 

[26]  Cline,  A.  K.,  Surface  Smoothing  with  Splines  under  Tension,  Report  CNA-170 
Department  of  Computer  Science,  University  of  Texas;  Austin,  Texas,  January 
1981. 

[27]  Vemuri,  B.  et  al.,  "Curvature-Based  Representation  of  Objects  from  Range 
Data,"  Image  and  Vision  Computing,  Vol.  4,  No.  2,  1986,  pp.  107-114. 

[28]  Farin,  Gerald,  Curves  and  Surfaces  for  Computer  Aided  Geometric  Design:  A 
Practical  Guide,  Academic  Press,  Boston,  cl 988. 

[29]  Fan,  T.  et  al.,  "Segmented  Descriptions  of  3-D  Surfaces,"  IEEE  Journal  of 
Robotics  and  Automation,  Vol.  3,  No.  6,  December  1987,  pp.  527-538. 

[30]  Faugeras,  O.,  and  M.  Herbert,  "The  Representation,  Recognition,  and  Locating 
of  3D  Objects,"  International  Journal  of  Robotics  Research,  Vol.  5,  No.  3,  1986, 
pp.  27-52. 


48 


